Improving the Compression Efficiency for News Web Service Using Semantic Relations Among Webpages

Authors

  • Xiao Wei
  • Xiangfeng Luo
  • Qing Li
Abstract

Both compression and decompression play important roles in a web service system. A high compression ratio saves storage, while fast decompression reduces the service's response time. Focusing specifically on news web services, this paper proposes a compression mechanism that improves the efficiency of compression and decompression simultaneously by taking advantage of the semantic relations among webpages. First, webpages are clustered into news topics according to their semantic similarity. Webpages belonging to the same topic share much duplicate content, which improves the compression ratio when delta-compression is used. Second, associated news topics are detected with the help of a multiple-semantic link network of news topics. Associated topics are compressed into the same zip file, which may reduce the number of decompression operations, given how users typically read news on the Web. The authors apply the proposed compression mechanism to a practical news search engine, and the experimental results show that it achieves both a high compression ratio and fast decompression.
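The intuition behind the first step, that pages on the same topic compress better against each other than on their own, can be illustrated with a minimal sketch. This is not the authors' mechanism: it assumes zlib's preset-dictionary feature as a stand-in for delta-compression, and the function names (`compress_independently`, `compress_against_base`, `decompress_against_base`) are hypothetical.

```python
import zlib


def compress_independently(pages):
    """Baseline: compress every page on its own."""
    return [zlib.compress(page) for page in pages]


def compress_against_base(pages):
    """Delta-style stand-in: store the first page of a topic in full, then
    compress each later page with the first page as a preset dictionary,
    so content shared with the base page costs almost nothing."""
    base = pages[0]
    blobs = [zlib.compress(base)]
    for page in pages[1:]:
        compressor = zlib.compressobj(zdict=base)
        blobs.append(compressor.compress(page) + compressor.flush())
    return blobs


def decompress_against_base(blobs):
    """Inverse of compress_against_base: recover the base page first,
    then use it as the dictionary for the remaining pages."""
    base = zlib.decompress(blobs[0])
    pages = [base]
    for blob in blobs[1:]:
        decompressor = zlib.decompressobj(zdict=base)
        pages.append(decompressor.decompress(blob) + decompressor.flush())
    return pages
```

Comparing the total size of `compress_against_base(pages)` with that of `compress_independently(pages)` for a group of pages that share most of their text shows the effect the abstract describes. How pages are grouped into topics, and how associated topics are packed into one archive, are outside this sketch.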

Similar resources

A procedure for Web Service Selection Using WS-Policy Semantic Matching

In general, policy-based approaches play an important role in the management of web services, in particular in the selection of semantic web services and quality of service (QoS). This research illustrates a procedure for selecting among functionally similar web services based on WS-Policy semantic matching. In this study, the procedure of WS-Policy publi...

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP), intensified by the growing volume, heterogeneity, and unstructured form of information. One of the core information extraction tasks is relation extraction wh...

Adaptive Information Analysis in Higher Education Institutes

Information integration plays an important role in academic environments since it provides a comprehensive view of education data and enables managers to analyze and evaluate the effectiveness of education processes. However, the problem with traditional information integration is the lack of personalization due to weak information resources or unavailable analysis functionality. In this ...

Prioritize the ordering of URL queue in Focused crawler

The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, it is not a simple task to download domain-specific web pages. An unfocused approach often shows undesired results. Therefore, several new ideas have been proposed; among them, a key technique is focused crawling, which is able to crawl particular topical...

Journal title:
  • IJCINI

Volume 7  Issue 

Pages  -

Publication date: 2013